NSF PAR Search | NSF Public Access Repository

Interpretable Failure Detection with Human-Level Concepts

https://doi.org/10.1609/aaai.v39i25.34831

Nguyen, Kien X; Li, Tang; Peng, Xi (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

Reliable failure detection is of paramount importance in safety-critical applications. Yet neural networks are known to produce overconfident predictions for misclassified samples. As a result, failure detection remains problematic, because existing confidence score functions rely on category-level signals, the logits, to detect failures. This research introduces an innovative strategy, leveraging human-level concepts for a dual purpose: to reliably detect when a model fails and to transparently interpret why. By integrating a nuanced array of signals for each category, our method enables a finer-grained assessment of the model's confidence. We present a simple yet highly effective approach based on the ordinal ranking of concept activation to the input image. Without bells and whistles, our method is able to significantly reduce the false positive rate across diverse real-world image classification benchmarks, specifically by 3.7% on ImageNet and 9.0% on EuroSAT.
Free, publicly-accessible full text available April 11, 2026
DEAL: Disentangle and Localize Concept-Level Explanations for VLMs

https://doi.org/10.1007/978-3-031-72933-1_22

Li, Tang; Ma, Mengmeng; Peng, Xi (October 2024, In Proceedings of the European Conference on Computer Vision)

Full Text Available
Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients

Ma, Mengmeng; Li, Tang; Peng, Xi (September 2024, In Proceedings of the International Conference on Machine Learning)

Full Text Available
Critical assessment of pan-genomic analysis of metagenome-assembled genomes

https://doi.org/10.1093/bib/bbac413

Li, Tang; Yin, Yanbin (September 2022, Briefings in Bioinformatics)

Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi’o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies.
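To illustrate why lowering the CG threshold can recover core genes lost to incompleteness, here is a minimal Python sketch. It is not taken from the paper or from any of the tools named above: the function `core_genes`, the gene and genome names, and the toy data are all hypothetical, and a "core gene" is assumed here to mean a gene family present in at least a given fraction of genomes.

```python
def core_genes(presence, threshold=1.0):
    """Return gene families found in at least `threshold` fraction of genomes.

    presence: dict mapping a genome name to the set of gene family IDs
              detected in that genome (hypothetical input format).
    threshold: fraction of genomes in [0, 1]; 1.0 means strictly all genomes.
    """
    if not presence:
        return set()
    n_genomes = len(presence)
    counts = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, c in counts.items() if c / n_genomes >= threshold}


# Toy data (assumption): geneB is truly shared, but the incomplete MAG
# "mag4" is missing it, so a strict threshold drops it from the core.
genomes = {
    "g1": {"geneA", "geneB", "geneC"},
    "g2": {"geneA", "geneB"},
    "g3": {"geneA", "geneB", "geneD"},
    "mag4": {"geneA"},
}

strict_core = core_genes(genomes, threshold=1.0)    # requires 4 of 4 genomes
relaxed_core = core_genes(genomes, threshold=0.75)  # requires 3 of 4 genomes
print(sorted(strict_core), sorted(relaxed_core))
```

In this toy case the strict threshold keeps only `geneA`, while the relaxed threshold also recovers `geneB`. Real tools apply their own clustering and thresholds, so this only shows the counting idea.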
Categorization of Orthologous Gene Clusters in 92 Ascomycota Genomes Reveals Functions Important for Phytopathogenicity

https://doi.org/10.3390/jof7050337

Peterson, Daniel; Li, Tang; Calvo, Ana M.; Yin, Yanbin (May 2021, Journal of Fungi)
Phytopathogenic Ascomycota are responsible for substantial economic losses each year, destroying valuable crops. The present study aims to provide new insights into phytopathogenicity in Ascomycota from a comparative genomic perspective. This has been achieved by categorizing orthologous gene groups (orthogroups) from 68 phytopathogenic and 24 non-phytopathogenic Ascomycota genomes into three classes: core, (pathogen or non-pathogen) group-specific, and genome-specific accessory orthogroups. We found that (i) ~20% of orthogroups are group-specific and accessory in the 92 Ascomycota genomes, (ii) phytopathogenicity is not phylogenetically determined, (iii) group-specific orthogroups have more enriched functional terms than accessory orthogroups, and this trend is particularly evident in phytopathogenic fungi, (iv) secreted proteins with signal peptides and horizontal gene transfers (HGTs) are the two functional terms that show the highest occurrence and significance in group-specific orthogroups, and (v) a number of other functional terms also show higher significance and occurrence in group-specific orthogroups. Overall, our comparative genomics analysis determined positive enrichment existing between orthogroup classes and revealed a prediction of what genomic characteristics make an Ascomycete phytopathogenic. We conclude that genes shared by multiple phytopathogenic genomes are more important for phytopathogenicity than those that are unique in each genome.
Full Text Available
dbCAN-PUL: a database of experimentally characterized CAZyme gene clusters and their substrates

https://doi.org/10.1093/nar/gkaa742

Ausland, Catherine; Zheng, Jinfang; Yi, Haidong; Yang, Bowen; Li, Tang; Feng, Xuehuan; Zheng, Bo; Yin, Yanbin (September 2020, Nucleic Acids Research)
PULs (polysaccharide utilization loci) are discrete gene clusters of CAZymes (Carbohydrate Active EnZymes) and other genes that work together to digest and utilize carbohydrate substrates. While PULs have been extensively characterized in Bacteroidetes, there exist PULs from other bacterial phyla, as well as archaea and metagenomes, that remain to be catalogued in a database for efficient retrieval. We have developed an online database dbCAN-PUL (http://bcb.unl.edu/dbCAN_PUL/) to display experimentally verified CAZyme-containing PULs from literature with pertinent metadata, sequences, and annotation. Compared to other online CAZyme and PUL resources, dbCAN-PUL has the following new features: (i) batch download of PUL data by target substrate, species/genome, genus, or experimental characterization method; (ii) annotation for each PUL that displays associated metadata such as substrate(s), experimental characterization method(s) and protein sequence information; (iii) links to external annotation pages for CAZymes (CAZy), transporters (UniProt) and other genes; (iv) display of homologous gene clusters in GenBank sequences via an integrated MultiGeneBlast tool; and (v) an integrated BLASTX service available for users to query their sequences against PUL proteins in dbCAN-PUL. With these features, dbCAN-PUL will be an important repository for CAZyme and PUL research, complementing our other web servers and databases (dbCAN2, dbCAN-seq).
Full Text Available